Contents Page

  1. Introduction & Current Market Practice
  2. App Overview
  3. Data Cleaning of Amenities & BTO
  4. Web Scraping & Data Transformation
  5. Data Analysis for the prediction model
  6. App Functions
  7. Findings
  8. Further Development
  9. Conclusion

1. Introduction & Current Market Practice

This application was developed to aid prospective Build-To-Order (BTO) flat buyers. Buyers consider many factors when choosing a BTO flat, including nearby facilities (transport: MRT stations; education: schools; medical: clinics; entertainment: malls), the true value of the flat in the event of a sale, the historical prices of other flats in the area, and how the flat compares with condominiums and resale flats. The “BTO Analysis Buddy” was therefore created as an all-in-one solution that guides the user step by step towards a flat that meets their needs.

Currently, buyers have to use Google and visit various websites to gather their own information. Examples include Seedly’s BTO guide, MoneySmart’s BTO guide, “How I Bought My HDB Resale Flat” by TheSmartLocal and many more. However, it is time-consuming and tedious to visit each page, and because every page presents different information in a different way, it is challenging to put the information together and synthesise it. Shortlisting a suitable BTO is difficult, and even after shortlisting, one still has to look up each address in Google Maps to view the nearby amenities. This process can take weeks, and there is much room to make it more efficient.

Currently, there is also no website or app that integrates BTO with HDB Resale or Condominiums, making it difficult for potential buyers to compare them. Although there are websites that list HDB Resale flats like PropertyGuru and SRX, they lack the convenience that our app provides, such as easy comparison and filtering by multiple criteria.

2. App Overview

The “BTO Analysis Buddy” application has 4 tabs, each with different functionality. Although users can use each tab independently, the most common user journey is described below for explanation purposes:

  1. Home Page: Users can filter for housing type to view BTO, HDB resale or condominiums, as well as filter for nearby amenities. The map tiles will also show the estimated pricing range. After filtering, one can click on the marker to view details and add the listing to a comparison table.

  2. Comparison: Next, this tab allows one to easily compare shortlisted flats. One can see some key features (e.g. number of rooms, size) as well as the predicted prices.

  3. Analysis: The Analysis tab gives more information about the various factors that affect price, based on a linear regression model. This can be an important input to users’ decision-making process.

  4. Amenities: After deciding or shortlisting flats, this tab shows an in-depth view of the exact amenities near the flat that one has selected.

3. Data Cleaning of Amenities & BTO

All .Rmd and HTML code for this section can be viewed in the folder “01 Data Cleaning of Amenities & BTO”.

MRT Data

The MRT data was taken from data.gov.sg and Kaggle. These files were then merged to get the address, station name, station and coordinates of the MRT stations.

Schools Data

As no datasets of primary and secondary schools were readily available online, we used the “geocode” function from the “ggmap” package to retrieve the coordinates of primary and secondary schools in Singapore. However, as some of the latitude and longitude values returned by Google Maps were wrong, we cleaned the data and corrected those values manually.
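One way to narrow down which geocoded rows need manual correction is to flag coordinates that fall outside Singapore. The sketch below is written in Python purely for illustration (the project's own code is in R). The bounding box values are approximate assumptions, and the school names and coordinates are made-up placeholders.

```python
# Approximate bounding box for Singapore (assumed values, for illustration).
SG_LAT_RANGE = (1.15, 1.48)
SG_LON_RANGE = (103.59, 104.10)


def is_in_singapore(lat, lon):
    """Return True if the coordinate lies inside the rough bounding box."""
    return (SG_LAT_RANGE[0] <= lat <= SG_LAT_RANGE[1]
            and SG_LON_RANGE[0] <= lon <= SG_LON_RANGE[1])


def flag_suspect_rows(rows):
    """Return rows with missing or out-of-range coordinates for manual review."""
    return [r for r in rows
            if r["lat"] is None or r["lon"] is None
            or not is_in_singapore(r["lat"], r["lon"])]


# Hypothetical geocoding output; names and coordinates are placeholders.
schools = [
    {"name": "School A", "lat": 1.3521, "lon": 103.8198},
    {"name": "School B", "lat": 35.6762, "lon": 139.6503},  # geocoded abroad
    {"name": "School C", "lat": None, "lon": None},          # no result returned
]

suspects = flag_suspect_rows(schools)
print([r["name"] for r in suspects])
```

A check like this only catches gross errors; a result that lands inside Singapore but at the wrong address would still need to be spotted by hand.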

Malls Data

The malls data was scraped from a malls website to get all the shopping malls in Singapore listed on that website. Using geocode, we then obtained each mall’s latitude and longitude from its address.

Nature Parks Data

The raw data on the names and locations of the nature parks was obtained from data.gov.sg as a KML file. We then converted the KML file into a dataframe and extracted the necessary data. Some data cleaning was needed, as the original data was poorly formatted once converted into a dataframe.
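The KML-to-table step can be sketched as follows. This is an illustrative Python version using only the standard library, not the project's R code. The sample KML, its park names and its coordinates are invented, and the real data.gov.sg file may store its fields differently (which would explain the extra cleaning described above).

```python
import xml.etree.ElementTree as ET

# Standard KML 2.2 namespace.
KML_NS = {"kml": "http://www.opengis.net/kml/2.2"}

# Invented sample; the real file layout is an assumption.
SAMPLE_KML = """<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Placemark>
      <name>  Park A </name>
      <Point><coordinates>103.80,1.35,0</coordinates></Point>
    </Placemark>
    <Placemark>
      <name>Park B</name>
      <Point><coordinates> 103.90,1.40,0 </coordinates></Point>
    </Placemark>
  </Document>
</kml>"""


def kml_to_rows(kml_text):
    """Extract one {name, lat, lon} row per Placemark, trimming whitespace."""
    root = ET.fromstring(kml_text)
    rows = []
    for placemark in root.findall(".//kml:Placemark", KML_NS):
        name = placemark.findtext("kml:name", default="", namespaces=KML_NS)
        coords = placemark.findtext(".//kml:coordinates", default="",
                                    namespaces=KML_NS)
        # KML stores coordinates as "longitude,latitude[,altitude]".
        lon, lat = coords.strip().split(",")[:2]
        rows.append({"name": name.strip(), "lat": float(lat), "lon": float(lon)})
    return rows


parks = kml_to_rows(SAMPLE_KML)
print(parks)
```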

Clinics Data

As there are many CHAS clinics around Singapore and these clinics are easily accessible, it is highly likely that most Singaporeans would visit them when they fall sick. The CHAS clinics KML file was taken from data.gov.sg and we extracted the data from it. No data cleaning was needed for this file, as all information was already presented accurately.

BTO Data

The BTO data was obtained from the HDB website.

4. Web Scraping & Data Transformation

Code for this section can be found under “02 Webscraping & Data transformation”.

HDB Resale Data

The HDB resale data was scraped from the SRX website using the code in the CrawlSRX.Rmd file. As part of this code, the longitude and latitude were obtained by passing each postal code to a OneMap.sg API.

Condominium Resale Data

The condominium resale data was scraped from the SRX website using the code in the CrawlSRX.Rmd file. As part of this code, the longitude and latitude were obtained by passing each postal code to a OneMap.sg API.
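The postal-code lookup used for both resale datasets can be sketched as below, in Python for illustration rather than the R used in CrawlSRX.Rmd. The query parameters (`searchVal`, `returnGeom`, `getAddrDetails`, `pageNum`) and the response fields (`results`, `LATITUDE`, `LONGITUDE`) are assumptions about the OneMap search API and should be checked against its current documentation. The base URL is deliberately left as a parameter, and the sample response is invented.

```python
from urllib.parse import urlencode


def build_search_url(base_url, postal_code):
    """Build a search URL for a postal code (parameter names are assumed)."""
    query = urlencode({
        "searchVal": postal_code,
        "returnGeom": "Y",
        "getAddrDetails": "Y",
        "pageNum": 1,
    })
    return f"{base_url}?{query}"


def parse_coordinates(response):
    """Return (lat, lon) from the first search result, or None if none found."""
    results = response.get("results") or []
    if not results:
        return None
    first = results[0]
    return float(first["LATITUDE"]), float(first["LONGITUDE"])


# Invented response in the assumed shape; no network call is made here.
sample_response = {
    "found": 1,
    "results": [
        {"POSTAL": "123456", "LATITUDE": "1.3000", "LONGITUDE": "103.8000"},
    ],
}

print(build_search_url("https://example.invalid/search", "123456"))
print(parse_coordinates(sample_response))
```

Returning None for an empty result keeps unmatched postal codes visible, so they can be reviewed instead of being silently dropped.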

Data transformation to obtain distance of nearest amenity from flat

For each of the files containing BTO, resale HDB and resale condominium data, we added columns containing the distance from each housing unit to the nearest amenity of each of 5 types (MRT stations, parks, malls, schools and clinics). This is done with a nested for loop that iterates over every housing unit in each file and every amenity, keeping the minimum distance for each amenity type. We then append these minimum distances to the respective files for the different housing types.
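A minimal sketch of this nearest-amenity computation is shown below, in Python for illustration (the project code is in R). The source does not say which distance formula was used, so the haversine great-circle distance here is an assumption, as are the column names and the sample coordinates.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def add_min_distances(homes, amenities_by_type):
    """Add one dist_<type>_km field per amenity type to every home."""
    for home in homes:                                  # every housing unit
        for kind, places in amenities_by_type.items():  # every amenity type
            home[f"dist_{kind}_km"] = min(
                haversine_km(home["lat"], home["lon"], p["lat"], p["lon"])
                for p in places                         # every amenity
            )
    return homes


# Placeholder coordinates, not real listings or amenities.
homes = [{"id": "flat-1", "lat": 1.30, "lon": 103.80}]
amenities = {
    "mrt": [{"lat": 1.30, "lon": 103.80}, {"lat": 1.40, "lon": 103.80}],
    "mall": [{"lat": 1.31, "lon": 103.80}],
}

add_min_distances(homes, amenities)
print(homes[0])
```

The loop compares every home with every amenity, which is simple and adequate for a few thousand rows; a spatial index would only be worth adding for much larger datasets.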

5. Data Analysis for the prediction model

Code for this section can be found under “03 Data Analysis - Prediction Model”.

Preparing Data for Predictive Analysis

From the cleaned resale data, categorical variables are split into separate columns through one-hot encoding. The closest distance to each type of amenity (malls, MRT stations, parks, clinics and schools) is added, as this might increase the accuracy of the prediction model. The age of the house was computed from the year in which the HDB flat was built. The number of bus stops around the house was also added. Please refer to “A. Prediction_cleaning” for the full cleaning code.
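The two transformations described above, one-hot encoding and deriving age from the build year, can be sketched as follows. This is an illustrative Python version, not the code in “A. Prediction_cleaning”. The column names, the sample rows and the reference year are assumptions.

```python
def one_hot(rows, column):
    """Replace a categorical column with one 0/1 column per category."""
    categories = sorted({r[column] for r in rows})
    encoded = []
    for r in rows:
        new_row = {k: v for k, v in r.items() if k != column}
        for c in categories:
            new_row[f"{column}_{c}"] = 1 if r[column] == c else 0
        encoded.append(new_row)
    return encoded


def add_age(rows, reference_year):
    """Add the age of each flat, computed from the year it was built."""
    return [{**r, "age": reference_year - r["year_built"]} for r in rows]


# Placeholder rows; values are not taken from the scraped dataset.
flats = [
    {"flat_type": "3-room", "year_built": 1990},
    {"flat_type": "4-room", "year_built": 2010},
]

prepared = add_age(one_hot(flats, "flat_type"), reference_year=2020)
print(prepared)
```

In a regression with an intercept, one column per category is usually dropped as the baseline so that the encoded columns are not perfectly collinear.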

Data Analysis for Prediction Model

After cleaning the data for prediction (see the previous section), a linear regression model was fitted. Variables that are not statistically significant at the 5% level are removed, apart from the number of bus stops (bus_stops): the team feels that this factor could be re-engineered in future to include the distance from these bus stops, which may improve its predictive power. The model has an adjusted R-squared of 81.97%, meaning that it explains about 82% of the variance in prices and is fairly accurate in predicting the actual value.

Fig.6: Linear Regression Model
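The variable selection rule and the adjusted R-squared statistic can be illustrated with the short Python sketch below. It is not the team's R code. The variable names and p-values are invented, and only the 5% threshold and the exception for bus_stops come from the description above.

```python
def select_predictors(p_values, alpha=0.05, always_keep=()):
    """Keep variables significant at level alpha, plus any forced exceptions."""
    return [name for name, p in p_values.items()
            if p < alpha or name in always_keep]


def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    return 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Invented p-values for illustration only.
p_values = {"size": 0.001, "dist_park": 0.40, "bus_stops": 0.20}

kept = select_predictors(p_values, always_keep=("bus_stops",))
print(kept)
print(adjusted_r_squared(0.5, n_obs=11, n_predictors=2))
```

Adjusted R-squared penalises each extra predictor, so it is a fairer measure than plain R-squared when comparing models before and after variables are removed.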

For the 5-year predicted value of HDB flats, today’s predicted value is adjusted by a discount rate based on the flat’s age 5 years later. The team used Bala’s Table (a valuation table giving leasehold value as a percentage of freehold value: https://www.99.co/blog/singapore/wp-content/uploads/2020/04/GCDjTNq.png) to estimate the discount rate, and added that to the expected inflation rate of 1.459% (https://www.ceicdata.com/en/indicator/singapore/forecast-consumer-price-index-growth) to obtain the 5-year predicted value.
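One possible reading of this adjustment is sketched below in Python. It is an interpretation and not the team's actual formula. It assumes that the discount rate is the relative change in the Bala's Table percentage between today and 5 years later, and that the 1.459% inflation rate is annual and compounded over 5 years. The percentages in the example are placeholders, not values taken from the table.

```python
def five_year_value(value_today, pct_now, pct_in_5_years,
                    annual_inflation=0.01459, years=5):
    """Estimate the value in `years` years under the stated assumptions.

    pct_now / pct_in_5_years: leasehold value as a fraction of freehold value
    for the remaining lease today and after `years` years.
    """
    # Relative change from lease decay (negative as the lease shortens).
    lease_change = pct_in_5_years / pct_now - 1
    # Cumulative inflation over the horizon, compounded annually (assumption).
    cumulative_inflation = (1 + annual_inflation) ** years - 1
    # The two rates are added together, following the description above.
    return value_today * (1 + lease_change + cumulative_inflation)


# Placeholder inputs: a $500,000 predicted value and invented percentages.
estimate = five_year_value(500_000, pct_now=0.90, pct_in_5_years=0.88)
print(round(estimate))
```

Under this reading, a flat gains value from inflation and loses value from lease decay, and the net effect depends on which is larger over the 5 years.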

6. App Functions

A. Home Page

Our home page shows an interactive map that lets users filter by the type of property they want (BTO flats, HDB resale flats or condominiums). For BTOs, we show all the flats that were released by HDB in 2020. For HDB resale flats and condominiums, the map shows all available units listed for sale on SRX. There are 4 main features on the home page.

i. Filtering: Users would decide which BTOs to buy based on the nearby amenities available. Different buyers have different requirements as well. Some would like flats near MRT stations while couples with cars may not be too concerned about the distance of their flat from the MRT station. These couples would prioritise other facilities such as shopping centres or parks. We hence created an additional filter that allows users to select the facilities they want, as well as the maximum distance they are willing to accept. The map will then update to show only units that are within the selected distance for those facilities chosen.

For example, suppose a user is not interested in MRT stations, and would like to search for HDB resale flats that are within 1km of malls, parks, clinics and schools. The intended result is shown below.

Fig.1: HDB Resale Flats That Are Within 1km of Malls, Parks, Clinics, and Schools
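The filtering logic can be sketched as follows, in Python for illustration (the app itself is built with Shiny in R). The field names and listings are hypothetical. The example mirrors the case above: MRT stations are left unselected and every other amenity must be within 1 km.

```python
def filter_listings(listings, max_distance_km):
    """Keep listings within the limit for every selected amenity.

    max_distance_km maps each selected amenity to its maximum distance;
    amenities that are not selected are simply absent and therefore ignored.
    """
    return [
        listing for listing in listings
        if all(listing["distances_km"][amenity] <= limit
               for amenity, limit in max_distance_km.items())
    ]


# Hypothetical listings with precomputed nearest-amenity distances.
listings = [
    {"id": "A", "distances_km": {"mrt": 2.5, "mall": 0.4, "park": 0.8,
                                 "clinic": 0.2, "school": 0.6}},
    {"id": "B", "distances_km": {"mrt": 0.3, "mall": 1.6, "park": 0.5,
                                 "clinic": 0.1, "school": 0.9}},
]

selected = {"mall": 1.0, "park": 1.0, "clinic": 1.0, "school": 1.0}
matches = filter_listings(listings, selected)
print([m["id"] for m in matches])
```

Listing A is kept even though it is far from an MRT station, because MRT was not among the selected amenities.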

ii. View Flat Details: We created a ‘Select Address’ button in each marker’s pop-up, which users can click to see more information. We used the “button onclick” feature within Shiny to create this.

Initially, only one unit per marker could be seen: multiple units at the same location overlapped, and their details could not be selected. However, multiple units in the same block can be on sale at once. For example, a BTO block can contain 3-Room, 4-Room and 5-Room flats. To overcome this problem, we modified the code so that all flats in a block or condominium that share the same longitude and latitude appear as a single marker. Selecting that marker shows all the units being sold in the block, grouped by characteristics such as number of bedrooms and bathrooms as well as asking price. The grouping prevents units with exactly the same characteristics from being listed in multiple rows, which would clutter the screen. For further convenience, the display table can be moved around, and an opacity setting in the code makes it semi-transparent when the mouse is outside the table. With these features, users have the flexibility to customise the display to their liking.

Fig.2: “Select Address” Button and Display Table
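The marker grouping can be sketched as below. This is an illustrative Python version with hypothetical field names and listings, not the app's R code. Units are keyed by their coordinates so that each location yields one marker, and identical units collapse into a single table row.

```python
from collections import defaultdict


def group_units_by_location(units):
    """Map each (lat, lon) to its distinct (bedrooms, bathrooms, price) rows."""
    markers = defaultdict(set)
    for unit in units:
        key = (unit["lat"], unit["lon"])
        # A set removes units with exactly the same characteristics.
        markers[key].add((unit["bedrooms"], unit["bathrooms"],
                          unit["asking_price"]))
    return {key: sorted(rows) for key, rows in markers.items()}


# Hypothetical listings: three in one block (two identical), one elsewhere.
units = [
    {"lat": 1.30, "lon": 103.80, "bedrooms": 3, "bathrooms": 2, "asking_price": 450000},
    {"lat": 1.30, "lon": 103.80, "bedrooms": 3, "bathrooms": 2, "asking_price": 450000},
    {"lat": 1.30, "lon": 103.80, "bedrooms": 4, "bathrooms": 2, "asking_price": 580000},
    {"lat": 1.35, "lon": 103.85, "bedrooms": 2, "bathrooms": 1, "asking_price": 300000},
]

markers = group_units_by_location(units)
for location, rows in markers.items():
    print(location, rows)
```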

iii. Comparing Flats: The amount of data may be overwhelming, and users may want to narrow it down to a few selected units for more in-depth analysis. We hence created an ‘Add to Comparison’ button in each pop-up, which allows users to add units of interest to a comparison table in the next tab for further analysis. Integrating our application tabs allows for a seamless experience.

iv. Price Ranges: To add further value for users, we also show the average price ranges of HDB resale flats across towns using a colour-gradient heatmap. In this way, users immediately know which towns are more expensive and which are cheaper, and can narrow their search to specific areas based on their budget. This feature can be enabled or disabled with the checkbox in the top right corner, depending on the user’s preference. Users can also select different map types with the checkbox. From the image below, we can see that HDB resale flats in the north (e.g. Sembawang, Yishun) tend to be a lot cheaper than HDB resale flats in the south (e.g. Bukit Merah, Queenstown).

Fig.3: Price Range of Resale Flats

Finally, our app aims to provide as much customisation as possible, and besides the features already mentioned above (such as the ability to move the comparison and amenities filter tables, and the ability to disable the price range gradient), users are also able to select different types of map to suit their own preferences.

B. Amenities Tab

Since many HDB buyers are couples who are most likely first-time buyers, we created an additional “Amenities” tab to give them a clearer picture of the amenities near the HDB flat that they are planning to bid for. The amenities that HDB buyers can filter for are MRT stations, parks, schools, clinics and malls. Users can also click on “Filter for all amenities on map” to view all amenities.

All these are especially relevant for HDB buyers who are married or soon to be married couples as they are likely to stay at this location as they expand their family. Hence this page serves to help them have a clearer picture of HDB flats which have the specific amenities that they are looking for nearby.

Fig.4: Amenities Tab

C. Comparison Tab

The Comparison tab serves as a useful tool for the user to compare different properties, showing the comparison metrics side-by-side. The metrics that are shown are the Property Name, the Category (Resale or BTO), Neighbourhood (Area in Singapore), Address, Property Type, Postal Code, Number of Bathrooms, Number of Bedrooms, Size (in square metres), Predicted Value, Predicted Value in 5 Years, and Current Price. Only BTO and Resale HDB properties are included, as our app primarily caters to first-time buyers, and they are not likely to compare HDB flats with condominiums.

Fig.5: Comparison Tab

In this interface, users can select the postal code for the Resale HDB flats from the dropdown menu. Users can also type the postal code in if they prefer. When the button “Add Address” is clicked, a row (or a few rows, depending on whether the particular property has many types in one building, like having 2 Rooms, 3 Rooms, or 4 Rooms within the same building) will be added to the dataframe on the right. A row can also be added if the user presses the “Add to Comparison” button in the Home Page tab.

Most buyers would consider future resale value an important criterion when buying a BTO or resale flat. Hence, the table in the Comparison tab also includes the predicted value today and the predicted value 5 years later. The predicted value today is important for buyers who are considering a resale HDB flat and want to know whether it is a good buy by comparing it with the asking price. The 5-year predicted value is important for BTO buyers who want to estimate the flat’s resale value after the Minimum Occupation Period (5 years) as part of an investment strategy.

The “Clear All” button clears all the rows. To delete specific rows, the user can either enter the rows they want to delete and press the “Delete Row” button, or click on a row and then press the “Delete Row” button.

This allows more flexibility and customisability in comparing different properties. We hope that by listing them together, the user can see all of them at a glance, which makes the process of finding a new home more convenient.

D. Prediction Model & Analysis Tab

To let users better compare HDB values against individual variables, an Analysis tab was created. The tab allows users to analyse the effect of variables such as size, floor level, house type (e.g. 3 rooms), age of house and distance from amenities (e.g. MRT) on a flat’s predicted value and current value. Users can thus see whether a flat is overvalued (price is more than predicted value) or undervalued (price is less than predicted value), which better informs their purchase decision. Users can also compare prices across regions (e.g. Ang Mo Kio) based on predicted value and market/historical value.

Fig.7: Region Price

There is a large difference between predicted and market values when comparing within a single variable (e.g. see below for the size of house comparison). This is because market values are confounded by other variables, whereas predicted values hold the other variables constant (using the regression coefficients to estimate). Hence, the predicted values are considered a more accurate reflection of the value of the house based on the selected variable alone, ignoring the effects of other variables.

Fig.8: Size of House

  1. Size of House Analysis:

The blue points represent resale prices by size of HDB flat, and the black line represents the average value today based on the prediction model. From this, users can see how much they should be paying for a given size (keeping all other factors constant). Hence, if a user finds a house priced close to or below the predicted line, the house is a good buy.

Fig.9

  2. Floor and House Type:

From this chart, users can see that floor level does have an impact on housing prices, but it is minimal compared to the impact of house type (3 rooms, 4 rooms etc.). This could be because house type is related to house size. Users can also estimate how much more they should be paying for a high floor compared to a low floor, or for different room types.

Fig.10

  3. Age of House:

We can see that prices of flats decrease as the age of the house increases. Moreover, older houses (30 years and above) tend to have a larger price range. The orange line represents the average price based on the prediction model. This serves 2 purposes. First, users can see how much, on average, the value of their flat is likely to decrease over the years. Secondly, if they are buying a resale flat, they can see how much they should be paying based on the age of the house.

Fig.11

  4. Distance from Amenities:

This chart shows how distances from amenities affect the housing prices, and by how much. Users can see that prices of houses increase as they are further away from schools. One possible explanation is that houses further from the schools will be less noisy, hence value increases when houses are further from schools. As distances from MRT and Malls increase, the price decreases. MRT has a much greater impact on house prices which shows that buyers want houses that are close to MRT. From this, users will know that they should take into consideration the distance from MRT when buying the house as this will have a significant impact on its prices.

Fig.12

  5. Region Prices:

This shows the average price of HDB houses by region (e.g. Ang Mo Kio); the user can choose between historical prices and model-based prices. The darker the colour, the more expensive the region. From this chart, users can see that southern regions (e.g. Bukit Timah, Queenstown) have higher prices.

7. Findings

A. Region Prices:

From the Analysis tab, we can see that the price range of houses is huge for the same value of a given factor. For example, for a house size of 95 sqm, the difference between the minimum and maximum price is around $800,000. This is also evident in the age of house analysis, where the minimum and maximum prices differ significantly for the same age. This shows that no single factor determines the price of a house, as there are numerous confounding factors. Hence, users should take various factors into consideration when estimating the price of a house, and be careful to pay a fair value in the resale market.

B. 3 rooms valued less than what it should be:

Looking at historical prices, 3-room flats are priced much lower than 4-room flats and above (i.e. the price gap between 3 rooms and 4 rooms is larger than that between 4 rooms and 5 rooms). From its research, the team finds that the average size of a 3-room flat is 65 sqm, whereas the average sizes of 4-room and 5-room flats are 95 sqm and 110 sqm. The team treats the size increment as about the same (20 sqm), but the price difference between 3 rooms and 4 rooms ($200k) is much larger than that between 4 rooms and 5 rooms ($100k). One possible explanation is that housing prices are not linear in size (e.g. they follow a log distribution instead). Another possible explanation is that sellers place a large discount on houses below a certain size, and 3-room flats fall below this threshold. Hence, buyers could take advantage of this opportunity, as 3-room flats are more ‘worth’ getting: their price falls by more than their smaller size would suggest.

8. Further Development

Currently, this application is catered towards first-time flat buyers who are more interested in HDB flats. For future development, the team can look into developing different versions of the application, for instance catered to condominium buyers. If a condominium tool version was developed, the future pricing analysis would differ, and users may also have different requirements - for instance, they may want a comparison with landed properties. Future development of this application can include more facilities such as stadiums, religious sites, hawker centres and carparks to provide a more comprehensive overview for users. Upcoming MRTs and amenities can also be included, which would help users to make better decisions, and make future price analysis even more insightful. This is especially so for non-mature estates such as Tengah, where many new amenities are scheduled to be opened in the next few years.

9. Conclusion

BTO Analysis Buddy brings convenience to users and is designed to be aesthetically pleasing and user-friendly. Most importantly, it combines filtering, comparison and predictive features that together make the BTO buying process a breeze. Though BTO Analysis Buddy was designed mainly for users who are looking for a BTO flat, it could also assist users who are not first-time buyers. These users might be looking to purchase a resale flat or condominium to move into or to invest in. Moreover, those who are looking to rent a flat could use the “Amenities” tab in BTO Analysis Buddy to see if there are facilities nearby that meet their needs.